Emergence and resilience of social networks: a general theoretical framework

We introduce and study a general model of social network formation and evolution based on the concept
of preferential link formation between similar nodes and increased similarity between connected nodes. The 
model is studied numerically and analytically for three definitions of similarity. In common with real-world 
social networks, we find coexistence of high and low connectivity phases and history dependence. We suggest 
that the positive feedback between linking and similarity which is responsible for the model's behaviour is also 
an important mechanism in real social networks. 

PACS numbers: 



I. INTRODUCTION 

There is a growing consensus among social scientists that
many social phenomena display an inherent network dimension.
Not only are they "embedded" in the underlying social
network [1] but, reciprocally, the social network itself is
largely shaped by the evolution of those phenomena. The
range of social problems subject to these considerations is
wide and important. It includes, for example, the spread of
crime [2, 3] and other social problems (e.g. teenage pregnancy
[4, 5]), the rise of industrial districts [6, 7, 8], and the
establishment of research collaborations, both scientific [9, 10]
and industrial [11, 12]. Throughout these cases, there are a
number of interesting observations worth highlighting:

(a) Sharp transitions: The shift from a sparse to a highly
connected network often unfolds rather "abruptly," i.e. in a
short timespan. For example, concerning the escalation of social
pathologies in some neighborhoods of large cities, Crane
[4] writes that "...if the incidence [of the problem] reaches a
critical point, the process of spread will explode." Also, considering
the growth of research collaboration networks, Goyal
et al. [10] report a steep increase in the per capita number of
collaborations among academic economists in the last three
decades, while Hagedoorn [11] reports an even sharper (tenfold)
increase for R&D partnerships among firms during the
decade 1975-1985.

(b) Resilience: Once the transition to a highly connected
network has taken place, the network is robust, surviving
even a reversion to "unfavorable" conditions. The case of
California's Silicon Valley, discussed in a classic account by
Saxenian [7], illustrates this point well. Its thriving performance,
even in the face of the general crisis undergone by
the computer industry in the 1980s, has been largely attributed
to the dense and flexible networks of collaboration across
individual actors that characterized it. Another intrinsically
network-based example is the rapid recent development of
Open-Source software (e.g. Linux), a phenomenon sustained
against large odds by a dense web of collaboration and trust
[13]. Finally, as an example where "robustness" has negative
rather than positive implications, Crane [4] describes the difficulty,
even with vigorous social measures, of improving a
local neighborhood once crime and other social pathologies
have taken hold.

(c) Equilibrium co-existence: Under apparently similar
environmental conditions, social networks may be found
in either a dense or a sparse state. Again, a good illustration
is provided by the dual experience of poor neighborhoods
in large cities [4], where neither poverty nor other socio-economic
conditions (e.g. ethnic composition) can alone explain
whether or not there is degradation into a ghetto with
rampant social problems. Returning to R&D partnerships,
empirical evidence [11] shows a very polarized situation,
with almost all R&D partnerships taking place in a few
(high-technology) industries. Even within those industries,
partnerships are almost exclusively between a small subset of firms
in (highly advanced) countries [31].

From a theoretical viewpoint, the above discussion raises 
the question of whether there is some common mechanism at 
work in the dynamics of social networks that, in a wide variety 
of different scenarios, produces the three features explained 
above: (a) discontinuous phase transitions, (b) resilience, and 
(c) equilibrium coexistence. Our aim in this paper is to shed 
light on this question within a general framework that is flexi- 
ble enough to accommodate, under alternative concrete spec- 
ifications, a rich range of social-network dynamics. 

The recent literature on complex networks has largely fo- 
cused on understanding what are the generic properties aris- 
ing in networks under different link formation mechanisms. 
Those properties are important to gain a proper theoretical 
grasp of many network phenomena and also provide useful 
guiding principles for empirical research. The analysis, however,
has been mostly static, largely concerned with features
such as small-world [16] or scale-free [17] networks. In contrast,
our approach in this paper to the issue of network formation
is intrinsically dynamic, the steady state being a balance
of link formation and removal.

We consider a set of agents (be they individuals or organizations)
who establish bilateral interactions (links) when
profitable. The network evolves under changing conditions.
That is, the favorable circumstances that led at some point 
to the formation of a particular link may later on deterio- 
rate, causing that link's removal. Hence volatility (exogenous 
or endogenous) is a key disruptive element in the dynamics. 
Concurrently, new opportunities arise that favour the forma- 
tion of new links. Whether linking occurs depends on factors 
related to the similarity or proximity of the two parties. For 
example, in cases where trust is essential in the establishment 
of new relationships (e.g. in crime or trade networks), linking 
may be facilitated by common acquaintances or by the exis- 
tence of a chain of acquaintances joining the two parties. In 
other cases (e.g. in R&D or scientific networks), a common 
language, methodology, or comparable level of technical com- 
petence may be required for the link to be feasible or fruitful 
to both parties. 

In a nutshell, our model conceives the dynamics of the net- 
work as a struggle between volatility (that causes link decay) 
on the one hand, and the creation of new links (that is depen- 
dent on similarity) on the other. The model must also specify 
the dynamics governing inter-node similarity. A reasonable 
assumption in this respect is that such similarity is enhanced 
by close interaction, as reflected by the social network. For 
example, a firm (or researcher) benefits from collaborating 
with a similarly advanced partner, or individuals who inter- 
act regularly tend to converge on their social norms and other 
standards of behavior. 

We study different specifications of the general framework, 
each one embodying alternative forms of the intuitive idea 
that "interaction promotes similarity." Our main finding is that 
in all of these different cases the network dynamics exhibits, 
over a wide range of parameters, the type of phenomenology 
discussed above. The essential mechanism at work is a pos- 
itive feedback between link creation and internode similarity, 
these two factors each exerting a positive effect on the other. 
Feedback forces of this kind appear to operate in the dynam- 
ics of many social networks. We show that they are sufficient 
to produce the sharp transitions, resilience, and equilibrium 
co-existence that, as explained, are salient features of many 
social phenomena. 



II. THE MODEL 

Consider a set $\mathcal{N} = \{1, \dots, n\}$ of agents, whose interactions
evolve in continuous time $t$. Their network of interaction
at some $t$ is described by a non-directed graph
$g(t) \subset \{ij : i \in \mathcal{N}, j \in \mathcal{N}\}$, where $ij\,(= ji) \in g(t)$ iff a link exists
between agents $i$ and $j$. The network evolves in the following
manner. Firstly, each node $i$ receives an opportunity to form a
link with a node $j$, randomly drawn from $\mathcal{N}$ ($i \neq j$), at rate 1
(i.e. with a probability $dt$ in a time interval $[t, t + dt)$). If this
link $ij$ is not already in place, it forms with probability

$$
P\{ij \to g(t)\} =
\begin{cases}
1 & \text{if } d_{ij}(t) \le \bar{d} \\
\epsilon & \text{if } d_{ij}(t) > \bar{d}
\end{cases}
\tag{1}
$$

where $d_{ij}(t)$ is the "distance" (to be specified later) between
$i$ and $j$ prevailing at $t$. Thus if $i$ and $j$ are close, in the sense
that their distance is no higher than some given threshold $\bar{d}$,
the link forms at rate 1; otherwise, it forms at a much smaller
rate $\epsilon \ll 1$. Secondly, each existing link $ij \in g(t)$ decays
at rate $\lambda$. That is, each link in the network disappears with
probability $\lambda\,dt$ in a time interval $[t, t + dt)$.

We shall discuss three different specifications of the distance
$d_{ij}$, each capturing different aspects that may be relevant
for socio-economic interactions. Consider first the simplest
possible such specification where $d_{ij}(t)$ is the (geodesic)
distance between $i$ and $j$ on the graph $g(t)$, neighbors $j$ of $i$
having $d_{ij}(t) = 1$, neighbors of the neighbors of $i$ (which are
not neighbors of $i$) having $d_{ij}(t) = 2$, and so on. If no path
joins $i$ and $j$ we set $d_{ij}(t) = \infty$.
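
As an illustration of this first specification, the following Python sketch computes the geodesic distance by breadth-first search and evaluates the linking probability of Eq. (1). The adjacency-dictionary representation and the function names `geodesic_distance` and `link_probability` are assumptions made for this example; they are not part of the model's definition.

```python
from collections import deque
import math


def geodesic_distance(adj, i, j):
    """Shortest-path length between i and j; math.inf if no path joins them."""
    if i == j:
        return 0
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == j:
                    return dist[v]
                queue.append(v)
    return math.inf


def link_probability(d_ij, d_bar, eps):
    """Eq. (1): probability 1 if d_ij <= d_bar, eps otherwise."""
    return 1.0 if d_ij <= d_bar else eps


# Toy graph (hypothetical): a path 0-1-2 plus the isolated node 3.
adj = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
print(geodesic_distance(adj, 0, 2), geodesic_distance(adj, 0, 3))
```

With a threshold of 2, nodes 0 and 2 would link with probability 1, whereas nodes 0 and 3, which no path joins, would link only with probability $\epsilon$.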

This specific model describes a situation where the formation
of new links is strongly influenced by proximity on the
graph. It is a simple manifestation of our general idea that
close interaction brings about similarity: here the two metrics
coincide. When $\bar{d} > n - 1$, the link formation process
discriminates between agents belonging to the same network
component (which are joined by at least one path of links in
$g$) and agents in different components. Distinct components of
the graph may, for example, represent different social groups.
Then Eq. (1) captures the fact that belonging to the same social
group is important in the creation of new links (say, because
it facilitates control or reciprocity [14, 15]).

Consider first what happens when $\lambda$ is large. Let $c$ be the
average connectivity (number of links per node) in the network.
The average rate $n\lambda c/2$ of link removal is very high
when $c$ is significant. Consequently, we expect to have a very
low $c$, which in turn implies that the population should be fragmented
into many small groups. Under these circumstances,
the likelihood that an agent $i$ "meets" an agent $j$ in the same
component is negligible for large populations, and therefore
new links are created at a rate almost equal to $n\epsilon$. Invoking
a simple balance between link creation and link destruction,
the average number of neighbors of an agent is expected to be
$c \simeq 2\epsilon/\lambda$, as is indeed found in our simulations (Fig. 1).
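
The balance argument for the sparse regime can be checked with a small event-driven simulation. The Python sketch below is illustrative only: it assumes the component-based case (threshold at least $n - 1$, so that two agents in the same component link with probability 1), and the function names, population size and run lengths are choices made for this example rather than the settings used for Fig. 1.

```python
import random
from collections import deque


def same_component(adj, i, j):
    """Breadth-first search: True if some path of links joins i and j."""
    seen = {i}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if u == j:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def simulate_mean_degree(n, lam, eps, t_max, t_burn, seed=0):
    """Event-driven simulation; returns the time-averaged mean degree c."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    links = []          # current links stored as (i, j) pairs
    t, area = 0.0, 0.0  # area = time integral of the link count after t_burn
    while t < t_max:
        total_rate = n + lam * len(links)  # n linking clocks plus decay clocks
        t_next = min(t + rng.expovariate(total_rate), t_max)
        if t_next > t_burn:
            area += len(links) * (t_next - max(t, t_burn))
        t = t_next
        if t >= t_max:
            break
        if rng.random() * total_rate < n:
            # Linking opportunity: node i meets a random node j != i.
            i = rng.randrange(n)
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            if j not in adj[i]:
                p = 1.0 if same_component(adj, i, j) else eps
                if rng.random() < p:
                    adj[i].add(j)
                    adj[j].add(i)
                    links.append((i, j))
        else:
            # Decay: remove a uniformly chosen existing link.
            k = rng.randrange(len(links))
            links[k], links[-1] = links[-1], links[k]
            i, j = links.pop()
            adj[i].discard(j)
            adj[j].discard(i)
    # c is the mean number of neighbours, i.e. twice the links per node.
    return 2.0 * area / ((t_max - t_burn) * n)


c = simulate_mean_degree(n=200, lam=5.0, eps=0.2, t_max=60.0, t_burn=10.0, seed=1)
print(f"simulated c = {c:.3f}, balance estimate 2*eps/lam = {2 * 0.2 / 5.0:.3f}")
```

For strong volatility the simulated value should lie close to the balance estimate, with a small upward correction from the rare meetings between agents that already share a component.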

As $\lambda$ decreases, the network density $c$ increases gradually,
but then, at a critical value $\lambda_1$, it makes a discontinuous jump
(Fig. 1) to a state containing a large and densely interconnected
community covering a finite fraction of the population
(the giant component). Naturally, if volatility $\lambda$ decreases further,
the network becomes even more densely connected. But,
remarkably, if volatility increases back again beyond the transition
point $\lambda_1$, the dense network remains stable. The dense
network dissolves back into a sparsely connected one only at
a second point $\lambda_2$. This phenomenology characterizes a wide
region of parameter space (see inset of Fig. 1) and is qualitatively
well reproduced by a simple mean field approach (see
appendix).

A similar phenomenology occurs when $\bar{d} = 2$, i.e. when
links are preferentially formed with "friends of friends", in
an appropriate parameter range [32]. This is reminiscent of a
model that was recently proposed [18] to describe a situation
where (as e.g. in job search [30]) agents find new linking opportunities
through current partners. In [18] agents use their
links to search for new connections, whereas here existing
links favor new link formation. In spite of this conceptual difference,
the model in Ref. [18] also features the phenomenology
(a)-(c) above, i.e. sharp transitions, resilience, and phase
coexistence.

FIG. 1: Mean degree $c$ as a function of $\lambda$ for $\epsilon = 0.2$ when $d_{ij}$ is
the distance on the graph and $\bar{d} > n - 1$. The results of a mean
field theory for $n = \infty$ (solid line) are compared to numerical simulations
($\times$) starting from both low and high connected states with
$n = 20000$. The dashed line corresponds to an unstable solution
of the mean field equations which separates the basins of stability of
the two solutions. Indeed the low density state, for finite $n$, "flips"
to the high density state when a random fluctuation in $c$ brings the
system across the stability boundary (i.e. when a sizable giant component
forms). These fluctuations become more and more rare as $n$
increases. Inset: Phase diagram in mean field theory. Coexistence
occurs in the shaded region whereas below (above) only the dense
(sparse) network phase is stable. Numerical simulations (symbols)
agree qualitatively with the mean field prediction. The high (low)
density state is stable up (down) to the points marked with $\times$ (o) and
is unstable at points marked with o (+). The behavior of $c$ along the
dashed line is reported in the main figure.

We now consider an alternative specialization of the general
framework where link formation requires some form of
coordination, synchronization, or compatibility. For example,
a profitable interaction may fail to occur if the two parties do
not agree on where and when to meet, or if they do not speak
the same languages, and/or adopt compatible technologies and
standards. In addition, it may well be that shared social norms
and codes enhance trust and thus are largely needed for fruitful
interaction.

To account for these considerations, we endow each agent
with an attribute $x_i$ which may take a finite number $q$ of different
values, $x_i \in \{1, 2, \dots, q\}$. $x_i$ describes the internal
state of the agent, specifying e.g. its technological standard,
language, or the social norms it adopts. The formation of a
new link $ij$ requires that $i$ and $j$ display the same attribute, i.e.
$x_i = x_j$. This is a particularization of the general Eq. (1) with
$d_{ij} = 1 - \delta_{x_i,x_j}$ (so that $d_{ij} = 0$ when the attributes coincide)
and $\bar{d} = 0$. For simplicity we set $\epsilon = 0$ since
in the present formulation there is always a finite probability
that two nodes display the same attribute and hence can link.
We assume each agent revises its attribute at rate $\nu$, choosing
$x_i$ dependent on its neighbours' $x_j$'s according to:

$$
P\{x_i(t) = x\} = \frac{1}{Z} \exp\Big[\beta \sum_{j:\, ij \in g(t)} \delta_{x, x_j(t)}\Big]
\tag{2}
$$



where $\beta$ tunes the tendency of agents to conform with their
neighbors and $Z$ provides the normalisation. This adjustment
rule has a long tradition in physics [19] and also occurs in the
socio-economic literature as a model of coordination (or social
conformity) under local interaction [20, 21, 22]. This
is another manifestation of our general idea that network-mediated
contact favors internode similarity. We focus on the
case where such a similarity-enhancing dynamics proceeds at
a much faster rate than the network dynamics. That is, $\nu \gg 1$
so that, at any given $t$ where the network $g(t)$ is about to
change, the attribute dynamics on the $x_i$ have relaxed to a
stationary state. The statistics of this state is provided by the
Potts model in physics, which has been recently discussed for
random graphs [23, 24]. We refer to the appendix for details
and move directly to discussing the results.
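
The attribute revision rule of Eq. (2) can be sketched in a few lines of Python. In this minimal example an agent's new attribute is drawn with weight $\exp(\beta k_x)$, where $k_x$ is the number of its neighbours currently displaying attribute $x$. The function names and the list-based encoding of neighbour states are assumptions made for illustration.

```python
import math
import random


def attribute_probabilities(neighbor_states, q, beta):
    """Eq. (2): P(x) proportional to exp(beta * number of neighbours in state x)."""
    counts = [0] * q
    for x in neighbor_states:
        counts[x - 1] += 1          # attributes are labelled 1..q
    weights = [math.exp(beta * k) for k in counts]
    z = sum(weights)                # the normalisation Z
    return [w / z for w in weights]


def revise_attribute(neighbor_states, q, beta, rng):
    """Draw a new attribute in {1, ..., q} from the distribution of Eq. (2)."""
    probs = attribute_probabilities(neighbor_states, q, beta)
    return rng.choices(range(1, q + 1), weights=probs)[0]


# An agent with two neighbours in state 2 and one in state 3 (q = 3).
probs = attribute_probabilities([2, 2, 3], q=3, beta=1.0)
print([round(p, 3) for p in probs])
```

For $\beta = 0$ the choice is uniform over the $q$ attributes, while for large $\beta$ the agent adopts the attribute most common among its neighbours with probability close to one.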

For a given $\beta$, under strong volatility ($\lambda \gg 1$) the link density
is very low, there is no giant component and agents $i$, $j$
chosen at random (for $n$ large) are not coordinated
($P(x_i = x_j) = 1/q$). Hence links form at a node at rate $2/q$. A
simple balance of link formation and decay rates implies that
$c = 2/(q\lambda)$ in this case. When $\lambda$ decreases, network density
$c$ increases. First, it does so gradually but, at a critical
point $\lambda_1(\beta)$, $c$ becomes sufficiently large that the $x_i$'s within
the giant component (whose existence is necessary for coordination)
become coordinated. Link formation increases since
now $P(x_i = x_j) > 1/q$ and this in turn increases the coordination.
This positive feedback causes a sharp transition to
a coordinated, more highly connected state. Once this sharp
transition has taken place, further decreases in $\lambda$ are simply
reflected in gradual increases in network density. On the other
hand, subsequent changes of $\lambda$ in the opposite direction are
met by hysteresis. That is, if $\lambda$ now grows starting at values
below $\lambda_1$, the network does not revert to the sparse network at
the latter threshold. Rather, it remains in a dense state up to a
larger value $\lambda_2 > \lambda_1$, sustained by the same positive feedback
discussed above.

This phenomenology, though induced by a different mechanism,
is quite similar in spirit to that reported in Fig. 1 for the
previous model. In the limit $\beta \to \infty$, the second model becomes
equivalent to the first one since, with $\beta \to \infty$, all nodes
in the same component share the same value of $x_i(t)$, whilst
the probability to link two disconnected nodes is $\epsilon = 1/q$. In
fact, the roles of $1/\beta$ and $\lambda$ in the model are analogous. If
we fix $\lambda$ and parametrize the behavior of the model through
$1/\beta$, the same phenomena of discontinuous transitions, hysteresis,
and equilibrium co-existence occur for corresponding
threshold values $1/\beta_1$ and $1/\beta_2$, analogous to $\lambda_1$ and $\lambda_2$ in the
former discussion.

Finally, we consider a setup where $d_{ij}$ reflects proximity
of nodes $i$, $j$ in terms of some continuous (non-negative) real
attributes, $W_i(t)$, $W_j(t)$. These attributes could represent the
level of technical expertise of two firms involved in an R&D
partnership, or the competence of two researchers involved
in a joint project. They could also be a measure of income or
wealth that bears on the quality and prospects of a bilateral
relationship. Whatever the interpretation, it may be natural
in certain applications to posit that some process of diffusion
tends to equalize the levels displayed by neighboring agents.
This idea is captured by the following stochastic differential
equation:

$$
\frac{dW_i}{dt} = \nu \sum_{j:\, ij \in g(t)} \left[W_j(t) - W_i(t)\right] + W_i(t)\,\eta_i(t)
\tag{3}
$$

where $\eta_i(t)$ is uncorrelated white noise, i.e. $\langle \eta_i(t)\,\eta_j(t') \rangle =
D\,\delta_{ij}\,\delta(t - t')$. The first term of Eq. (3) describes the diffusion
component of the process, which draws the levels of neighboring
agents closer. This homogenizing force competes with
the random idiosyncratic growth term $W_i(t)\,\eta_i(t)$. Random
growth processes subject to diffusion such as that of Eq. (3)
are well known in physics. In particular it is known [26] that
the fluctuation properties of Eq. (3) when $D$ is larger than
a critical value $D_c$ are qualitatively different from those when
$D < D_c$.

Choosing $d_{ij} = |\log W_i - \log W_j|$ and updating both the
links and the $W_i$'s at comparable timescales, we have performed
extensive numerical simulations of the induced network dynamics.
Fig. 2 reports typical results for a simple discretized
version of Eq. (3) with $D > D_c$ (see caption of Fig. 2). As
in the two previous models, we find a discontinuous transition
between a sparse and a dense network state, characterized by
hysteresis effects. When the network is sparse, diffusion is ineffective
in homogenizing growth. Hence the distance $d_{ij}$ is
typically beyond the threshold $\bar{d}$, thus slowing down the link
formation process. On the other hand, with a dense network,
diffusion rapidly succeeds in narrowing the gaps between the
$W_i$'s of different nodes, which in turn has a positive effect on
network formation. As before, the phase transition and hysteresis
are a result of the positive feedback that exists between
the dynamics of the $W_i$ and the adjustment of the network. In
the stationary state we find that $W(t) = \langle W_i(t) \rangle$ grows exponentially
in time, i.e. $\log W(t) \sim v t$. Notably, the growth
process is much faster (i.e. $v$ is much higher) in the dense
network equilibrium than in the sparse one, as shown in the
upper panel of Fig. 2.
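
The discretized update described in the caption of Fig. 2 can be sketched as follows, working with the rescaled variable $h_i = D^{-1/2} \log W_i$. This is a minimal Python illustration under stated assumptions: all nodes are updated synchronously (the source does not specify the update order), the adjacency-list representation and the function names are hypothetical, and `threshold` stands for the linking threshold expressed in units of $\sqrt{D}$.

```python
import math
import random


def update_heights(h, adj, dt, rng):
    """One synchronous step of h_i(t + dt) = max_j h_j(t) + eta_i(t)."""
    sigma = math.sqrt(dt)  # eta_i is Gaussian with mean 0 and variance dt
    new_h = []
    for i, h_i in enumerate(h):
        # The maximum runs over the neighbourhood of i, including i itself.
        best = max([h_i] + [h[j] for j in adj[i]])
        new_h.append(best + rng.gauss(0.0, sigma))
    return new_h


def within_threshold(h, i, j, threshold):
    """True if |log W_i - log W_j| is within the threshold (units of sqrt(D))."""
    return abs(h[i] - h[j]) <= threshold


rng = random.Random(0)
adj = [[1], [0], []]       # nodes 0 and 1 are linked, node 2 is isolated
h = [0.0, 1.0, 5.0]
h = update_heights(h, adj, dt=0.2, rng=rng)
print(h, within_threshold(h, 0, 1, threshold=0.4))
```

In a full simulation this step would alternate with link creation, at rate 1 for pairs within the threshold and at rate $\epsilon$ otherwise, and with link decay at rate $\lambda$.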

Finally, we note that when diffusion is very strong compared
to the idiosyncratic shocks in Eq. (3), i.e. $\nu \gg \sqrt{D}$,
we expect a much smaller distance $d_{ij}$ between agents in the
same component compared to agents in different components.
Thus the model becomes similar to the first one in this limit,
in the same way the second model did for $\beta \to \infty$.



FIG. 2: Mean degree $c$ (top) and growth rate $v$ (bottom) as a function
of $\lambda$ computed in numerical simulations of a discretized version
of the model with Eq. (3). More precisely, we iterate the equation
$h_i(t + \Delta t) = \max_j h_j(t) + \eta_i(t)$ where $j$ runs on the local neighborhood
of $i$, including $i$, $\Delta t$ is a small time interval, and $\eta_i(t)$ is a
Gaussian variable with mean 0 and variance $\Delta t$. This equation describes
the strong-noise limit of Eq. (3) and it is obtained by setting
$h_i = D^{-1/2} \log W_i$ when $D > 1$ (and $D > [\log(\nu \Delta t)]^2$). Here
we use $\Delta t = 0.2$, $\epsilon = 0.01$, $\bar{d} = 0.4\sqrt{D}$ and $n = 800$ (solid circles)
and 1600 (open diamonds).



III. CONCLUSION

In this paper we have proposed a general theoretical setup to
study the dynamics of a social network that is flexible enough
to admit a wide variety of particular specifications. We have
studied three such specifications, each illustrating a distinct
way in which the network dynamics may interplay with the
adjustment of node attributes. In all these cases, network
evolution displays the three features (sharp transitions, re- 
silience, and equilibrium co-existence) that empirical research 
has found to be common to many social-network phenomena. 
Our analysis indicates that these features arise as a conse- 
quence of the cumulative self-reinforcing effects induced by 
the interplay of two complementary considerations. On the 
one hand, there is the subprocess by which agent similarity 
is enhanced across linked (or close-by) agents. On the other 
hand, there is the fact that the formation of new links is much 
easier between similar agents. When such a feedback process 
is triggered, it provides a powerful mechanism that effectively 
offsets the link decay induced by volatility. 

The similarity-based forces driving the dynamics of the 
model are at work in many socio-economic environments. 
Thus, even though fruitful economic interaction often requires 
that the agents involved display some "complementary diver- 
sity" in certain dimensions (e.g. buyers and sellers), a key 
prerequisite is also that agents can coordinate in a number of 
other dimensions (e.g. technological standards or trading conventions).
Analogous considerations arise as well in the evolution
of many other social phenomena (e.g. the burst of social
pathologies discussed above) that, unlike what is claimed e.g.
by Crane [4], can hardly be understood as a process of epidemic
contagion on a given network. It is by now well understood
[28, 29] that such epidemic processes do not match the
phenomenology (a)-(c) reported in empirical research. Our
model suggests that a satisfactory account of these phenomena
must aim at integrating the dynamics on the network
with that of the network itself as part of a genuinely
co-evolutionary process.

IV. APPENDIX 

We characterize the long run behavior of the network in
terms of the stationary degree distribution $P(k)$, which is the
fraction of agents with $k$ neighbors. This corresponds to approximating
the network with a random graph (see [25]), an
approximation which is rather accurate in the cases we discuss
here. We focus on the limit $n \to \infty$, for which the
analysis is simpler, but finite size corrections can be studied
within this same approach. The degree distribution satisfies
a master equation [27], which is specified in terms of
the transition rates $w(k \to k \pm 1)$ for the addition or removal
of a link, for an agent linked with $k$ neighbors. While
$w(k \to k - 1) = \lambda k$ always takes the same form, the transition
rate $w(k \to k + 1)$ for the addition of a new link depends
on the particular specification of the distance $d_{ij}$. For the first
model $w(k \to k + 1) = \epsilon$ if the two agents are in different
components and $w(k \to k + 1) = 1$ if they are in the same. In
the large $n$ limit the latter case only occurs with some probability
if the graph has a giant component $\mathcal{G}$ which contains a
finite fraction $\gamma$ of nodes. For random graphs (see Ref. [25]
for details) the fraction of nodes in $\mathcal{G}$ is given by $\gamma = 1 - \phi(u)$
where $\phi(s) = \sum_k P(k) s^k$ is the generating function and $u$ is
the probability that a link, followed in one direction, does not
lead to the giant component. The latter satisfies the equation
$u = \phi'(u)/\phi'(1)$. Hence $u^k$ is the probability that an agent with
$k$ neighbours has no links connecting it to the giant component,
and hence is itself not part of the giant component. Then
the rate of addition of links, in the first model, takes the form

$$
w(k \to k + 1) = 2\left[\epsilon + (1 - \epsilon)\,\gamma\,(1 - u^k)\right],
$$

where the factor 2 comes because each node can either initiate
or receive a new link. The stationary state condition of the
master equation leads to the following equation for $\phi(s)$

$$
\lambda \phi'(s) = 2\left[\epsilon + (1 - \epsilon)\gamma\right]\phi(s) - 2(1 - \epsilon)\,\gamma\,\phi(us)
\tag{4}
$$

which can be solved numerically to the desired accuracy. Notice
that Eq. (4) is a self-consistent problem, because the parameters
$\gamma$ and $u$ depend on the solution $\phi(s)$. The solution
of this equation is summarized in Fig. 1. Either one or three
solutions are found, depending on the parameters. In the latter
case, the intermediate solution is unstable (dashed line in
Fig. 1) and it separates the basins of attraction of the two
stable solutions within the present mean field theory. Numerical
simulations reveal that the mean field approach is very
accurate away from the phase transition although it overestimates
the size of the coexistence region.
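
One possible way to solve this self-consistent problem numerically is sketched below in Python. It is not necessarily the procedure used to obtain Fig. 1: the sketch assumes a plain fixed-point iteration on $(\gamma, u)$, obtains $P(k)$ from the stationary flux balance $w(k \to k + 1) P(k) = \lambda (k + 1) P(k + 1)$ of the master equation (equivalent to Eq. (4)), and truncates the degree distribution at a hypothetical cutoff `k_max`. The function names are likewise illustrative.

```python
def degree_distribution(lam, eps, gamma, u, k_max=200):
    """Stationary P(k) from w(k -> k+1) P(k) = lam * (k+1) * P(k+1)."""
    p = [1.0]
    for k in range(k_max):
        w_up = 2.0 * (eps + (1.0 - eps) * gamma * (1.0 - u ** k))
        p.append(p[-1] * w_up / (lam * (k + 1)))
    norm = sum(p)
    return [x / norm for x in p]


def solve_mean_field(lam, eps, gamma0, u0, n_iter=500):
    """Iterate gamma = 1 - phi(u), u = phi'(u) / phi'(1); return (c, gamma, u)."""
    gamma, u = gamma0, u0
    c = 0.0
    for _ in range(n_iter):
        p = degree_distribution(lam, eps, gamma, u)
        c = sum(k * pk for k, pk in enumerate(p))            # phi'(1), mean degree
        phi_u = sum(pk * u ** k for k, pk in enumerate(p))   # phi(u)
        dphi_u = sum(k * pk * u ** (k - 1) for k, pk in enumerate(p) if k > 0)
        gamma, u = 1.0 - phi_u, dphi_u / c
    return c, gamma, u


# Same parameters, two initial conditions: no giant component vs. a full one.
c_sparse, g_sparse, _ = solve_mean_field(lam=0.5, eps=0.2, gamma0=0.0, u0=1.0)
c_dense, g_dense, _ = solve_mean_field(lam=0.5, eps=0.2, gamma0=1.0, u0=0.0)
print(f"sparse: c = {c_sparse:.3f}, gamma = {g_sparse:.3f}")
print(f"dense:  c = {c_dense:.3f}, gamma = {g_dense:.3f}")
```

Starting the iteration without a giant component keeps it on the sparse branch, where $P(k)$ is Poisson with mean $2\epsilon/\lambda$, while starting from a fully connected guess converges to the dense branch for the same parameters. This is the coexistence of solutions described above; the unstable intermediate solution is not reached by this simple iteration.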



Now we turn to the second model, where each node displays
one out of a finite set of attributes. In order to simplify
the analysis, we approximate the prevailing network $g$ with
a random graph with Poisson degree distribution and average
degree $c$, i.e. a graph where any given link $ij$ is present with
probability $c/(n - 1)$. Though not exact, this approximation
is rather accurate as confirmed by numerical simulations, and
it allows us to clarify the behavior of the model in a simple
and intuitive way. (A more precise solution, which relies on a
more accurate description of the network topology, can also be
derived, yielding no essential differences.) The solution of the
Potts model on random graphs of Ref. [23, 24] (with temperature
$T = 1/(2 k_B \beta)$) allows us to compute the probability that
two randomly chosen nodes $i$ and $j$ have $x_i = x_j$. Given the
Poisson approximation, such a probability is given by a function
$\pi(c, \beta) = \langle \delta_{x_i,x_j} \rangle$ of the average degree $c$ and $\beta$, as plotted
in Fig. 3. Equalizing the link destruction and formation
rates, $\lambda c/2 = \pi(c, \beta)$, yields an equation for the equilibrium
values of $c$, for any given $\beta$. A graphical approach shows
that when $\lambda > \lambda_2$ there is a single solution, representing a
sparse network. At $\lambda_2$ two other solutions arise, one of which
is unstable as above. At a further point $\lambda_1$ the sparse-network
solution merges with the unstable one and both disappear for
$\lambda < \lambda_1$, leaving only a solution with a stable and dense network.
This reproduces the same phenomenology observed in
the numerical simulations of the second model, which is also
qualitatively similar to that presented in Fig. 1 for the first
model.

FIG. 3: Graphical solution for the stationary state of the coordination
model for $q = 10$ and $\beta = 8$.